Classification on Soft Labels Is Robust against Label Noise

Author

  • Christian Thiel
Abstract

In supervised classification, labeled training data is essential. Unfortunately, the process by which those labels are obtained is not error-free, for example because of human error. The aim of this work is to find out what impact noise on the labels has, which we do by adding it artificially. An algorithm for the noising procedure is described. We study not only individual classifiers but also ensembles of classifiers whose answers are combined to increase overall performance. We also address the question of whether classifiers trained on soft labels are more resilient to label noise than those trained on hard labels.

Similar resources

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in text categorization, biology, and medical diagnosis, where multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...

Robust Loss Functions under Label Noise for Deep Neural Networks

In many applications of classifier learning, training data suffers from label noise. Deep networks are learned using huge training data where the problem of noisy labels is particularly relevant. The current techniques proposed for learning deep networks under label noise focus on modifying the network architecture and on algorithms for estimating true labels from noisy labels. An alternate app...

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

L1 Graph Based Sparse Model for Label De-noising

The abundant images and user-provided tags available on social media websites provide an intriguing opportunity to scale vision problems beyond the limits imposed by manual dataset collection and annotation. However, exploiting user-tagged data in practice is challenging since it contains many noisy (incorrect and missing) labels. In this work, we propose a novel robust graph-based approach for...

Efficient Learning of Classification Models from Soft-label Information by Binning and Ranking

Construction of classification models from data in practice often requires additional human effort to annotate (label) observed data instances. However, this annotation effort may often be too costly and only a limited number of data instances may be feasibly labeled. The challenge is to find methods that let us reduce the number of the labeled instances but at the same time preserve the qualit...

Publication date: 2008